A Primal-Dual Parallel Approximation Technique 
Applied to Weighted Set and Vertex Cover 

Samir Khuller    Uzi Vishkin    Neal Young



Abstract 

We give an efficient deterministic parallel approximation algorithm for the
minimum-weight vertex- and set-cover problems and their duals (edge/element packing). The
algorithm is simple and suitable for distributed implementation. It fits no existing
paradigm for fast, efficient parallel algorithms: it uses only "local" information at
each step, yet is deterministic. (Generally, such algorithms have required randomization.)
The result demonstrates that linear-programming primal-dual approximation
techniques can lead to fast, efficient parallel algorithms. The presentation does not
assume knowledge of such techniques.

Keywords: set cover, vertex cover, parallel algorithms, approximation algorithms. 



1 Introduction 

The linear-programming primal-dual method for obtaining sequential algorithms for exact
optimization problems is well studied [Ch83]. Primal-dual techniques have also been used to
obtain sequential approximation algorithms (e.g., for NP-hard problems [Ch79, Ho82], etc.,
and for on-line problems [You]). In this paper, we apply primal-dual techniques to obtain a
deterministic parallel approximation algorithm for the minimum-weight vertex- and set-cover
problems and their duals, maximum-weight edge and element packing.

The result demonstrates that linear-programming primal-dual techniques can lead to
fast, efficient parallel algorithms. The algorithm is natural, yet fits no existing paradigm for
such algorithms, being unique in that it uses only "local" information at each step, yet is
deterministic. Generally, such algorithms have required randomization (e.g., [II86, ABI86,
Lu86]).

Given an $n$-vertex, $m$-edge graph $G = (V, E)$ with vertex weights and an $\epsilon > 0$, our
algorithm returns a vertex cover of weight at most $2/(1-\epsilon)$ times the minimum. It uses
$O(\ln^2 m \ln\frac{1}{\epsilon})$ time and $m/\ln^2 m$ processors (i.e., $O(m \ln\frac{1}{\epsilon})$ operations) on an EREW-PRAM.
More generally, the algorithm finds a set cover of weight at most $r/(1-\epsilon)$ times the minimum,
using $O(r \ln^2 m \ln\frac{1}{\epsilon})$ time and $M/\ln^2 m$ processors (i.e., $O(rM \ln\frac{1}{\epsilon})$ operations). Here $m$
is the number of elements, $M$ is the sum of the set sizes, and $r$ is the maximum number of
sets in which any element occurs. (For vertex cover, $r = 2$.) In each case, the algorithm
also implicitly finds a near-maximal dual solution (an edge or element packing) that is also
within the corresponding factor of optimal.



The algorithm can be implemented using only integer arithmetic (see §4.2). If the weights
are integers and $1/\epsilon$ is greater than the sum of the vertex (resp. set) weights, then the weight
of the cover is at most 2 (resp. $r$) times the minimum.

1.1 Related Work 

The first $r$-approximation algorithm for weighted vertex/set cover was due to Hochbaum
[Ho82]. She considered the relaxation of the natural integer linear program for the problem.
The dual of this program is maximum edge packing. The so-called complementary-slackness
conditions are that a (fractional) cover and a packing are optimal provided (i) every vertex
in the cover has its constraint met with equality in the packing and (ii) every edge with
non-zero packing weight has exactly one vertex in the cover. Hochbaum observed that an
optimal packing is necessarily maximal, that for any maximal packing, the vertices
whose packing constraints are met with equality form a cover, that such a packing and cover
satisfy (i), and that (i) is sufficient to guarantee $r$-approximation because (ii) is
approximately satisfied in that every edge has at most $r$ vertices in the cover. Since an optimal
dual solution can be found in polynomial time by solving the linear program, Hochbaum
obtained a polynomial-time algorithm. Bar-Yehuda and Even [BE81] observed that
sequentially raising the edge-packing weights as much as possible yields a maximal edge packing,
thus obtaining a linear-time algorithm. For our algorithm, we relax (i) further, insisting
only that every vertex in the cover nearly have its constraint met, and we show how to
simultaneously raise many edge-packing weights so that the packing quickly becomes nearly
maximal and the weight of the cover formed by the vertices that nearly have their packing
constraints met with equality is within $r/(1-\epsilon)$ of optimal.

In [Cl83], Clarkson showed that in a restricted class of graphs, approximation ratios better
than 2 could be obtained for vertex cover. Clarkson gave the first parallel approximation
algorithm, a relatively complicated randomized algorithm [Cl91]. According to Motwani's
lecture notes on approximation algorithms [Mot92], which contain a survey of results on
vertex cover, the best approximation ratio known is $2 - \frac{\log\log n}{2\log n}$, due to Bar-Yehuda and Even



[BE85] and to Monien and Speckenmeyer [MS85]. In [Ho83], Hochbaum gives a
$(2 - 2/k)$-approximation algorithm, where $k$ is the maximum vertex degree, and she conjectures that
there is no polynomial-time $c$-approximation algorithm for any $c < 2$ unless P=NP.

Chvátal's weighted-set-cover algorithm guarantees a set cover of weight at most $\ln \Delta$
times the minimum, where $\Delta$ is the maximum set size [Ch79, Lo75, Jo74]. Berger, Rompel,
and Shor [BRS89] give a parallel algorithm that guarantees a factor of $(1 + \epsilon) \ln \Delta$. Their
algorithm uses a linear number of processors and runs in polylogarithmic time with some
restrictions on the weights.

The intuition behind our complexity analysis relies on a lemma of general interest for
parallel graph algorithms (Lemma 6). The lemma has previously found application in the
analyses of randomized parallel graph algorithms: Israeli and Itai's maximal-matching
algorithm [II86] and Alon, Babai and Itai's maximal-independent-set algorithm [ABI86].

In concurrent independent work, Cohen gives a parallel approximation algorithm for
maximum flow in shallow networks [Co92]. If network flows are viewed as packings of
source-to-sink paths, then maximal packings correspond to blocking flows. Cohen gives an $\epsilon$-blocking
flow algorithm that is similar in spirit to our algorithm, although a number of different
issues arise. In a more recent work, Luby and Nisan give a parallel primal-dual
approximation algorithm for positive linear programming [LN93]. Hochbaum's original algorithm
can be parallelized by employing Luby and Nisan's algorithm; the resulting algorithm would
obtain an approximation ratio comparable to ours and have an incomparable running time
(growing linearly with $1/\epsilon$, but not with $r$). Previously, Goldberg et al. [GPST92] gave a
parallel primal-dual algorithm to find (exactly) maximum-weight bipartite matchings. Their
algorithm appears to be the first parallel algorithm to use primal-dual techniques, but it
requires polynomial time.



1.2 Problem Definitions 

Let $G = (V, E \subseteq 2^V)$ be a given hypergraph with vertex weights $w : V \to \mathbb{R}^+$. Let $E(v)$
denote the set of edges incident to vertex $v$. Let $G$ have $m$ edges. Let $r$, the rank of $G$, be
the maximum size of any edge. (For an ordinary graph, $r = 2$.) Let $M$, the size of $G$, be
the sum of the edge sizes. For any real-valued function $f$ and a subset $S$ of its domain, let
$f(S)$ denote $\sum_{x \in S} f(x)$.



Vertex Cover. A vertex cover for $G$ is a subset $C \subseteq V$ of the vertices such that for each
edge $e \in E$, some vertex in $e$ is in $C$. The (minimum-weight) vertex-cover problem is to find
a vertex cover with minimum total weight $w(C)$.



Edge Packing. An edge packing is an assignment $p : E \to \mathbb{R}^+$ of non-negative weights
to the edges of the hypergraph such that the total weight $p(E(v))$ assigned to the edges
incident to any vertex $v$ is at most $w(v)$. The (maximum-weight) edge-packing problem is
to find an edge packing maximizing $p(E)$, the weight of $p$. The fractional relaxations of the
vertex cover and edge packing problems are linear programming duals.
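
The two definitions above can be checked mechanically on a small instance. The following is a minimal sketch, not part of the original presentation: the function names and the triangle instance are hypothetical, and edges are represented as tuples of vertex names.

```python
from fractions import Fraction

def is_vertex_cover(edges, cover):
    """True if every (hyper)edge contains at least one vertex of `cover`."""
    return all(any(v in cover for v in e) for e in edges)

def is_edge_packing(edges, weights, p):
    """True if p is non-negative and p(E(v)) <= w(v) for every vertex v."""
    if any(x < 0 for x in p):
        return False
    load = {v: Fraction(0) for v in weights}
    for e, x in zip(edges, p):
        for v in e:
            load[v] += x
    return all(load[v] <= weights[v] for v in weights)

# Hypothetical instance: a triangle on a, b, c with unit weights.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
weights = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(1)}
cover = {"a", "b"}
packing = [Fraction(1, 2)] * 3

cover_weight = sum(weights[v] for v in cover)   # w(C)
packing_weight = sum(packing)                   # p(E)
```

On this instance the packing has weight 3/2 and the cover has weight 2, so the packing weight does not exceed the cover weight.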

1.3 Related Problems



Let $\mathcal{C}$ be a family of sets with weights $w : \mathcal{C} \to \mathbb{R}^+$. Let $U$ denote $\bigcup_{S \in \mathcal{C}} S$.

Set Cover. A set cover is a subfamily $C' \subseteq \mathcal{C}$ such that $\bigcup_{S \in C'} S = U$; in words, every
element of $U$ is in some set in the cover. The (minimum-weight) set-cover problem is to find
a set cover of minimum total weight $w(C')$.



Element Packing. An element packing is an assignment of non-negative weights to the 
elements such that the total weight assigned to the elements of any set S is at most w(S). 
The (maximum-weight) element-packing problem is to find an element packing maximizing 
the net weight assigned to elements. 



1.4 Equivalences 

The vertex cover problem in hypergraphs is equivalent to the minimum-weight set-cover
problem as follows. For each $S \in \mathcal{C}$ we have a vertex $v_S$ in the hypergraph. For each element
$x \in U$, we have an edge that contains $v_S$ if and only if $x \in S$. The number of edges $m$ is
the number of elements. The rank $r$ is the maximum number of sets in which any element
occurs. The size $M$ is the sum of the set sizes. The dual problems are also equivalent.
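
The correspondence just described can be sketched in a few lines. This is an illustration under assumed conventions: the function name `set_cover_to_hypergraph`, the dictionary representation, and the three-set instance are hypothetical.

```python
def set_cover_to_hypergraph(sets):
    """Map a set-cover instance to a vertex-cover instance on a hypergraph.

    `sets` maps a set name S to its elements. Each set becomes a vertex,
    and each element x becomes the edge {S : x in S}.
    """
    edge_of = {}
    for name, elements in sets.items():
        for x in elements:
            edge_of.setdefault(x, set()).add(name)
    edges = {x: frozenset(names) for x, names in edge_of.items()}
    m = len(edges)                            # number of elements
    r = max(len(e) for e in edges.values())   # max number of sets per element
    M = sum(len(e) for e in edges.values())   # equals the sum of the set sizes
    return edges, m, r, M

# Hypothetical instance with three sets over the elements 1..5.
sets = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {1, 4, 5}}
edges, m, r, M = set_cover_to_hypergraph(sets)
```

Here each element lies in at most two sets, so the resulting hypergraph has rank 2, and its size equals the total of the set sizes.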



2 Reduction of Vertex Cover to $\epsilon$-Maximal Packing



We first reduce our problem to the problem of finding what we call an $\epsilon$-maximal packing.
This reduction generalizes [Ho82, BE81], who considered $\epsilon = 0$.

Lemma 1 (Duality) Let $C$ be an arbitrary vertex cover and $p$ an arbitrary edge packing.
Then $p(E) \le w(C)$.

Proof:

$$p(E) = \sum_{e \in E} p(e) \le \sum_{e \in E} |e \cap C|\, p(e) = \sum_{v \in C} p(E(v)) \le \sum_{v \in C} w(v) = w(C).$$

□
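
As a small worked instance of this chain (the triangle is a hypothetical example, not taken from the paper), consider the triangle on vertices $a, b, c$ with unit weights, the cover $C = \{a, b\}$, and the packing with $p(e) = 1/2$ on each of the three edges. Edge $ab$ meets $C$ in two vertices and the other two edges meet it in one, so

$$p(E) = \tfrac{3}{2} \le \sum_{e \in E} |e \cap C|\, p(e) = \tfrac{1}{2}(2 + 1 + 1) = 2 = p(E(a)) + p(E(b)) \le w(a) + w(b) = 2 = w(C).$$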



Lemma 2 (Approximate Complementary Slackness) Let $C$ be a vertex cover and $p$ be
a packing such that $p(E(v)) \ge (1-\epsilon) w(v)$ for every $v \in C$. Then $(1-\epsilon) w(C) \le r\,p(E)$. By
duality, the weights of $C$ and $p$ are within a factor of $r/(1-\epsilon)$ from their respective optima.

Proof: Since $(1-\epsilon) w(v) \le p(E(v))$ for $v \in C$,

$$(1-\epsilon) w(C) = (1-\epsilon) \sum_{v \in C} w(v) \le \sum_{v \in C} p(E(v)) = \sum_{e \in E} |e \cap C|\, p(e) \le r\,p(E).$$

□

We tighten Lemma 2 slightly when the weights are integers.

Lemma 3 In Lemma 2, if the weights are integers and $\epsilon < 1/w(V)$, then the weight of $C$
is at most $r$ times the minimum.

Proof: Let $C^*$ be a minimum-weight cover. From Lemmas 1 and 2, $(1-\epsilon) w(C) \le r\,p(E) \le r\,w(C^*)$, so
$w(C) \le r\,w(C^*) + \epsilon\,w(C)$. Since $\epsilon\,w(C) \le \epsilon\,w(V) < 1$ and the weights are integers,
$w(C) \le \lfloor r\,w(C^*) + \epsilon\,w(C) \rfloor = r\,w(C^*)$. □

Given a packing $p$, define $C_p = \{v \in V : p(E(v)) \ge (1-\epsilon) w(v)\}$. If $C_p$ is a vertex cover,
then we say $p$ is $\epsilon$-maximal. Note that $p$ is 0-maximal if and only if $p$ is maximal. By Lemma
2, if $p$ is $\epsilon$-maximal, then $C_p$ and $p$ are within a factor of $r/(1-\epsilon)$ from their respective
optima.
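
The definition of $C_p$ and of $\epsilon$-maximality can be tested directly. The sketch below assumes a hypothetical three-vertex path and hypothetical function names; it is meant only to illustrate the definition.

```python
from fractions import Fraction

def tight_vertices(edges, weights, p, eps):
    """Return C_p = {v : p(E(v)) >= (1 - eps) * w(v)}."""
    load = {v: Fraction(0) for v in weights}
    for e, x in zip(edges, p):
        for v in e:
            load[v] += x
    return {v for v in weights if load[v] >= (1 - eps) * weights[v]}

def is_eps_maximal(edges, weights, p, eps):
    """A packing p is eps-maximal when C_p covers every edge."""
    c_p = tight_vertices(edges, weights, p, eps)
    return all(any(v in c_p for v in e) for e in edges)

# Hypothetical instance: path a - b - c with weights 1, 2, 1.
edges = [("a", "b"), ("b", "c")]
weights = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1)}
eps = Fraction(1, 10)
p_low = [Fraction(1, 2), Fraction(1, 2)]        # no vertex is nearly tight
p_high = [Fraction(19, 20), Fraction(19, 20)]   # every vertex is nearly tight
```

With $\epsilon = 1/10$ the packing `p_high` is $\epsilon$-maximal, although it is not maximal in the exact ($\epsilon = 0$) sense.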



3 The Algorithm 

We have reduced the problem to finding an $\epsilon$-maximal packing. The algorithm maintains
a packing $p$ and the partial cover $C_p = \{v \in V : p(E(v)) \ge (1-\epsilon) w(v)\}$. The algorithm
increases the individual $p(e)$'s until $p$ is $\epsilon$-maximal, that is, until $C_p$ is a cover. When a vertex $v$
enters $C_p$, $v$ and the edges containing $v$ are deleted from the hypergraph. Let $E_p$ denote the
set of remaining edges, let $E_p(v)$ denote the remaining edges incident to vertex $v$, and let
$d_p(v)$ be the degree of $v$ in $G_p = (V, E_p)$. Define the residual weight $w_p(v)$ of vertex $v$ to be
$w(v) - p(E(v))$.

In a single round of the algorithm, for each remaining edge $e$, $p(e)$ is raised. To ensure
that $p$ remains a packing, each vertex $v$ limits the increase in each $p(e)$ for $e \ni v$ to at most
$w_p(v)/d_p(v)$. Each $p(e)$ is then increased as much as possible subject to the limits imposed
by all the $v \in e$. That is, each $p(e)$ is increased by $\min_{v \in e} w_p(v)/d_p(v)$. The algorithm
repeats this basic round until $p$ converges to an $\epsilon$-maximal packing. It then returns $C_p$. To
implement the algorithm we maintain $w_p$ instead of $p$:



Cover($G = (V, E)$, $w$, $\epsilon$)

    Returns a vertex cover of hypergraph $G$ of weight at most
    $r/(1-\epsilon)$ times the minimum.

    for $v \in V$ par-do $w_p(v) \leftarrow w(v)$; $E_p(v) \leftarrow E(v)$; $d_p(v) \leftarrow |E(v)|$
    while edges remain do
        for each remaining edge $e$ par-do $\delta(e) \leftarrow \min_{v \in e} w_p(v)/d_p(v)$
        for each remaining vertex $v$ par-do
            $w_p(v) \leftarrow w_p(v) - \sum_{e \in E_p(v)} \delta(e)$
            if $w_p(v) \le \epsilon\, w(v)$ then
                delete $v$ and incident edges, updating $E_p(\cdot)$ and $d_p(\cdot)$
    return the set of deleted vertices

As noted above, the limit on the increase in each $p(e)$ ensures that $p$ remains a packing.
Consequently, the correctness and the approximation ratio of the algorithm are established
by Lemmas 2 and 3. Using standard techniques [Ja92], each iteration of the while loop
beginning with $q$ remaining edges can be done in $O(\ln q)$ time and $O(rq)$ operations on an
EREW-PRAM.
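
To make the rounds concrete, here is a sequential simulation of Cover. It is a sketch for experimentation only, not the EREW-PRAM implementation: the function name `cover`, the list-of-tuples edge representation, the use of exact `Fraction` arithmetic, and the example instance are all assumptions, and it presumes positive weights and $\epsilon > 0$.

```python
from fractions import Fraction

def cover(edges, weights, eps):
    """Sequentially simulate the rounds of Cover; returns (cover, packing)."""
    w_p = {v: Fraction(w) for v, w in weights.items()}   # residual weights
    p = [Fraction(0)] * len(edges)                        # the packing
    alive = set(range(len(edges)))                        # remaining edges
    deleted = set()                                       # vertices in C_p
    while alive:
        # Degrees in the remaining hypergraph.
        deg = {}
        for i in alive:
            for v in edges[i]:
                deg[v] = deg.get(v, 0) + 1
        # Each edge rises by the smallest share offered by its vertices.
        delta = {i: min(w_p[v] / deg[v] for v in edges[i]) for i in alive}
        for i, d in delta.items():
            p[i] += d
            for v in edges[i]:
                w_p[v] -= d
        # Vertices that are nearly tight enter the cover.
        deleted |= {v for v in deg if w_p[v] <= eps * weights[v]}
        alive = {i for i in alive if not any(v in deleted for v in edges[i])}
    return deleted, p

# Hypothetical instance: path a - b - c with weights 1, 3, 3.
edges = [("a", "b"), ("b", "c")]
weights = {"a": 1, "b": 3, "c": 3}
chosen, packing = cover(edges, weights, Fraction(1, 10))
```

On this instance the simulation takes two rounds: vertex a is deleted in the first round and vertex b in the second.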

4 Complexity Analysis



In this section, we prove our main theorem: 

Main Theorem The algorithm requires $O(r \ln^2 m \ln\frac{1}{\epsilon})$ time and $M/\ln^2 m$ processors, i.e.,
$O(rM \ln\frac{1}{\epsilon})$ operations.

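We use a potential function argument. Given a packing $p$, define

$$\phi_p = \sum_{v} d_p(v) \ln\frac{w_p(v)}{\epsilon\, w(v)},$$

where the sum ranges over the vertices that remain (those not yet in $C_p$).

The next lemma shows that during an iteration of the while loop $\phi_p$ decreases by at least
the number of edges remaining at the end of the loop. This is how we show progress.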

Lemma 4 Let $p$ and $p'$, respectively, be the packing before and after an iteration of the
while loop. Then $\phi_p - \phi_{p'} \ge |E_{p'}|$.

Proof: During the iteration, we say that a vertex $v$ limits an incident edge $e \in E_p$ if $v$
determines the minimum in the computation of $\min_{v \in e} w_p(v)/d_p(v)$. For each vertex $v$, let $v$
limit $L(v)$ edges, so that $w_{p'}(v) \le w_p(v)(1 - L(v)/d_p(v))$. Let $V'$ denote the set of vertices
that remain after the iteration. Then

$$\begin{aligned}
\phi_p - \phi_{p'} &\ge \sum_{v \in V'} \left[ d_p(v) \ln\frac{w_p(v)}{\epsilon\, w(v)} - d_{p'}(v) \ln\frac{w_{p'}(v)}{\epsilon\, w(v)} \right] \\
&\ge \sum_{v \in V'} d_p(v) \ln\frac{w_p(v)}{w_{p'}(v)} \\
&\ge \sum_{v \in V'} -d_p(v) \ln(1 - L(v)/d_p(v)) \\
&\ge \sum_{v \in V'} L(v) \\
&\ge |E_{p'}|.
\end{aligned}$$

The second-to-last step follows because $-\ln(1-x) \ge x$. The last step follows because each
of the edges that remains is limited by some vertex in $V'$. □
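
The lemma can be checked numerically on small instances. The sketch below is an illustration under assumptions: it re-simulates the rounds sequentially, uses floating-point logarithms for the potential, and all function names and the test instance are hypothetical.

```python
import math
from fractions import Fraction

def potential(edges, alive, w_p, weights, eps):
    """phi_p = sum over remaining v of d_p(v) * ln(w_p(v) / (eps * w(v)))."""
    total = 0.0
    for i in alive:
        for v in edges[i]:   # each incidence contributes one unit of degree
            total += math.log(float(w_p[v] / (eps * weights[v])))
    return total

def potential_drops(edges, weights, eps):
    """Run the rounds; return (phi_before - phi_after, edges left) per round."""
    w_p = {v: Fraction(w) for v, w in weights.items()}
    alive, deleted, trace = set(range(len(edges))), set(), []
    while alive:
        before = potential(edges, alive, w_p, weights, eps)
        deg = {}
        for i in alive:
            for v in edges[i]:
                deg[v] = deg.get(v, 0) + 1
        delta = {i: min(w_p[v] / deg[v] for v in edges[i]) for i in alive}
        for i, d in delta.items():
            for v in edges[i]:
                w_p[v] -= d
        deleted |= {v for v in deg if w_p[v] <= eps * weights[v]}
        alive = {i for i in alive if not any(v in deleted for v in edges[i])}
        after = potential(edges, alive, w_p, weights, eps)
        trace.append((before - after, len(alive)))
    return trace

# Hypothetical instance: path a - b - c with weights 1, 3, 3.
trace = potential_drops([("a", "b"), ("b", "c")],
                        {"a": 1, "b": 3, "c": 3}, Fraction(1, 10))
```

Each entry of `trace` pairs the drop in potential during a round with the number of edges left afterwards; the lemma predicts that the first component is at least the second.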



Lemma 5 There are at most $(1 + r \ln\frac{1}{\epsilon})(1 + \ln m)$ iterations.

Proof: Let $p$ and $p'$, respectively, be the packing before and after any iteration. Let
$a = r \ln\frac{1}{\epsilon}$.
Clearly $\phi_{p'} \le |E_{p'}|\,a$, since each remaining edge contributes at most $r$ terms to the sum,
each at most $\ln\frac{1}{\epsilon}$. By Lemma 4, $\phi_{p'} \le \phi_p - |E_{p'}|$. Thus, $\phi_{p'} \le \phi_p (1 - 1/(a+1))$.
Before the first iteration, $\phi_p \le ma$. Inductively, before the $i$th iteration,

$$\phi_p \le ma\,(1 - 1/(a+1))^{i-1} \le ma \exp(-(i-1)/(a+1)).$$

The last inequality follows from $e^x \ge 1 + x$ for all $x$. Fixing $i = 1 + \lceil (a+1) \ln m \rceil$, we have
$\exp(-(i-1)/(a+1)) \le \exp(-\ln m) = 1/m$, so before the $i$th iteration, $\phi_p \le a$.

During each subsequent iteration, at least one edge remains, so $\phi_p$ decreases by at least
1. Thus, $\phi_p \le 0$ before an $(i+a)$th iteration can occur. □
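
For convenience, the rearrangement behind the geometric decrease can be spelled out. This is a routine step added here, combining the bound $|E_{p'}| \ge \phi_{p'}/a$ with Lemma 4:

$$\phi_{p'} \le \phi_p - |E_{p'}| \le \phi_p - \frac{\phi_{p'}}{a}
\quad\Longrightarrow\quad
\phi_{p'}\left(1 + \frac{1}{a}\right) \le \phi_p
\quad\Longrightarrow\quad
\phi_{p'} \le \frac{a}{a+1}\,\phi_p = \left(1 - \frac{1}{a+1}\right)\phi_p.$$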



Time. As each iteration requires $O(\ln m)$ time, the above lemma implies that the total
time is $O(r \ln^2 m \ln\frac{1}{\epsilon})$.

Operations. Recall that an iteration with $q$ edges requires $O(rq)$ operations.
Consequently, the total number of operations is bounded by an amount proportional to $r$ times
the sum, over all iterations, of the number of edges at the beginning of that iteration.

By Lemma 4, in a given iteration, $\phi_p$ decreases by at least the number of edges remaining
at the end of the iteration. Thus, the sum over all iterations of the number of edges at the
beginning of the iteration is at most $m + \phi_p$ for the initial $p$. This is $m + M \ln\frac{1}{\epsilon}$. Hence there are
$O(rM \ln\frac{1}{\epsilon})$ operations.
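
The telescoping sum behind this bound can be written out as follows. The notation is introduced here only for illustration: $q_j$ is the number of edges at the beginning of iteration $j$, $p_j$ is the packing after iteration $j$ (so $p_0$ is the initial, all-zero packing), and $T$ is the number of iterations. Using Lemma 4 for each iteration and the fact that the potential is never negative,

$$\sum_{j=1}^{T} q_j = m + \sum_{j=2}^{T} |E_{p_{j-1}}| \le m + \sum_{j=2}^{T} \left(\phi_{p_{j-2}} - \phi_{p_{j-1}}\right) = m + \phi_{p_0} - \phi_{p_{T-1}} \le m + \phi_{p_0} = m + \sum_{v \in V} d(v) \ln\frac{1}{\epsilon} = m + M \ln\frac{1}{\epsilon}.$$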



Processors. Using standard techniques, the operations can be efficiently scheduled
without increasing the time or the operations by more than a constant factor, so by the Work-Time
Scheduling Principle [Ja92], the number of processors required is $M/\ln^2 m$ (the work
divided by the time). This establishes the Main Theorem.



4.1 The Intuition for Ordinary Graphs 

The potential function analysis, while easy to verify, hides an interesting combinatorial
principle that gives a good intuitive understanding of the algorithm for ordinary graphs.
Recall that, during a single iteration of the while loop, a vertex $v$ limits an edge $e$ if $v$
determines the minimum in the calculation of $\min_{v \in e} w_p(v)/d_p(v)$, and that $L(v)$ denotes
the number of edges limited by $v$. If a vertex $v$ limits at least a third of its incident edges,
then $w_p(v)$ decreases by at least one third of its value. Call such a vertex good. (After $O(\ln\frac{1}{\epsilon})$
iterations of the while loop in which $v$ is good, $v$ will enter $C_p$.) In a given iteration, few
vertices might be good. However, at least half of the remaining edges touch good vertices.
This is a consequence of the following lemma:



Lemma 6 ([II86, ABI86]) Consider a directed graph. Call a vertex good if more than
one-third of its incident edges are directed into it. Then at least half of the edges are directed
into good vertices.

Proof: If a vertex is not good, call it bad. The in-degree of any bad vertex is at most half
its out-degree, so the number of edges directed into bad vertices is at most half the number
of edges directed out of bad vertices. Thus, the number of edges directed into bad vertices
is at most half the number of edges. Thus, at least half the edges are directed into good
vertices. □
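
The lemma is easy to test empirically. The following sketch counts, for randomly generated directed multigraphs without self-loops, the fraction of arcs entering good vertices; the function name, the random generator, and its fixed seed are arbitrary choices made for illustration.

```python
import random

def fraction_into_good(arcs):
    """Fraction of arcs (u, v) whose head v is good.

    A vertex is good if more than one-third of its incident arcs enter it.
    """
    indeg, deg = {}, {}
    for u, v in arcs:
        indeg[v] = indeg.get(v, 0) + 1
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    good = {v for v in deg if 3 * indeg.get(v, 0) > deg[v]}
    return sum(1 for _, v in arcs if v in good) / len(arcs)

rng = random.Random(0)   # fixed seed, chosen arbitrarily
for _ in range(200):
    n = rng.randint(2, 12)
    arcs = []
    for _ in range(rng.randint(1, 40)):
        u, v = rng.sample(range(n), 2)   # two distinct endpoints, no self-loops
        arcs.append((u, v))
    assert fraction_into_good(arcs) >= 0.5
```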

To see why the lemma applies, imagine directing each remaining edge into a vertex that
limits it. Then the lemma shows that at least half of the remaining edges touch vertices that
are good. Thus, in a given iteration, at least half the edges touch vertices whose residual
weights decrease by more than one third. This, intuitively, is why the algorithm makes
progress.

This lemma is of independent interest: it drives the analyses of the running times of
Israeli and Itai's randomized maximal matching algorithm [II86] and of Alon, Babai, and
Itai's randomized maximal independent set algorithm [ABI86]. Interestingly, the natural
generalization of the lemma to hypergraphs is not strong enough to give an analysis as tight
as our potential function analysis.

4.2 Using Integer Arithmetic 

If arithmetic precision is an issue, we can uniformly scale the original (integer) vertex
weights so that the smallest weight is at least $m/\epsilon$, and then use integer division
(taking the floor) when computing the $w_p(v)/d_p(v)$'s. Essentially the same analysis carries
through. (If $w(v) \ge m/\epsilon$, then $w_p(v) \ge m$ while $v$ remains, so $w_p(v)/d_p(v) \ge 1$, hence
$\lfloor w_p(v)/d_p(v) \rfloor \ge (w_p(v)/d_p(v))/2$, and the net reduction in a $w_p(v)$ during an iteration is
at least half what it would have been without taking the floor. Thus, the analysis will go
through by doubling the potential function.)

Assuming without loss of generality that $\epsilon \ge 1/(2w(V))$, if the original weights are $k$-bit
integers, then the largest weight after scaling is bounded by

$$\frac{m}{\epsilon}\, 2^{k+1} \le 2^{k+2}\, m\, w(V) \le 2^{2k+3}\, m\, |V|.$$

Hence the scaled weights are $(2k + 3 + \log_2 m + \log_2 |V|)$-bit integers. Subsequently all
operations involve only integer arithmetic on smaller, non-negative integers.
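
The scaling step and the floor-division bound can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, and the scaling factor is taken to be the smallest integer that brings the minimum weight up to $m/\epsilon$, which is one possible reading of "uniformly scale".

```python
from fractions import Fraction
from math import ceil

def scale_weights(weights, m, eps):
    """Multiply integer weights by one integer factor so min weight >= m/eps."""
    factor = ceil(Fraction(m) / (eps * min(weights.values())))
    return {v: w * factor for v, w in weights.items()}

def integer_share(w_p, d_p):
    """Per-edge limit offered by a vertex, using integer (floor) division."""
    return w_p // d_p

# Hypothetical instance with m = 2 edges and eps = 1/10.
m, eps = 2, Fraction(1, 10)
scaled = scale_weights({"a": 1, "b": 3, "c": 3}, m, eps)
```

After scaling, every residual weight of a remaining vertex is at least $m$, so each exact share is at least 1 and its floor is at least half of it.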

Acknowledgments: We would like to thank Michael Luby and an anonymous referee for
pointing out connections to the randomized algorithms of [ABI86, Lu86] and to the work by
Hochbaum [Ho82], respectively.